Regression is a nice way of modelling measurements.
Maybe you have seen this regression model before:
\[ y = \alpha + X\cdot\beta + \epsilon \]
Components:

- \(y\): the measurements
- \(X\): the predictors, a matrix of known quantities
- \(\alpha\) and \(\beta\): unknown parameters
- \(\epsilon\): the measurement error
Often \(\epsilon\) is assumed to have a normal distribution with location zero and scale parameter \(\sigma\).
Another way of saying the same thing:
\[ y \sim N(\alpha + X\cdot\beta, \sigma) \]
I prefer this notation because it nicely separates the different components, and doesn’t hide the error parameter \(\sigma\). It also makes it clear what you might change. For example \(\alpha + X\cdot\beta\) is just one option: there are many ways in which measurements and predictors can be related. The normal distribution is also not required, in fact it is often inappropriate.
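The two notations describe the same model, which can be verified with a quick simulation (the values for \(\alpha\), \(\beta\) and \(\sigma\) here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(12345)

# made-up parameter values for illustration
alpha, beta, sigma = 1.0, 2.5, 0.3
X = rng.normal(size=10_000)

# additive-error form: y = alpha + X * beta + epsilon
y_additive = alpha + X * beta + rng.normal(0, sigma, size=X.size)

# distributional form: y ~ N(alpha + X * beta, sigma)
y_distributional = rng.normal(loc=alpha + X * beta, scale=sigma)

# the two samples should have matching means and standard deviations
print(y_additive.mean(), y_distributional.mean())
print(y_additive.std(), y_distributional.std())
```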
The key features of a regression model:

- some measurements \(y\)
- some known predictors \(X\)
- a probabilistic relationship connecting the predictors with the measurements
In the context of regression modelling, Bayesian inference lets you give free rein to your creativity when designing regressions, so you can make models that represent what you know about the measurement process.
Generalised linear models (GLMs) are a popular and powerful class of regression model with the following distinguishing characteristics:
The predictor is a linear function of the parameters and features, e.g. \(\alpha + X\cdot\beta\)
The probability distribution describing measurement errors is not necessarily the normal distribution, e.g. log-normal, Poisson, Dirichlet, Gamma, …
There is a link function that connects the linear predictor with the probability distribution, e.g. \(\ln(\alpha + X\cdot\beta)\).
An example GLM for positive-constrained measurements:
\[ y \sim LN(\ln(\alpha+X\cdot\beta), \sigma) \]
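A quick simulation (again with made-up values for \(\alpha\), \(\beta\) and \(\sigma\)) shows why this setup suits positive-constrained measurements: every log-normal draw is positive:

```python
import numpy as np

rng = np.random.default_rng(12345)

# made-up values; alpha + X * beta must stay positive for the log to exist
alpha, beta, sigma = 2.0, 0.5, 0.1
X = rng.uniform(1, 10, size=10_000)

# y ~ LN(ln(alpha + X * beta), sigma)
y = rng.lognormal(mean=np.log(alpha + X * beta), sigma=sigma)

print(y.min() > 0)  # True: log-normal draws are always positive
```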
A practical guide to regression modelling: Gelman, Hill, and Vehtari (2020)
The section of the Stan user’s guide on regression models is really nice.
Modern Statistics for Modern Biology is an online (and physical) textbook with some very good material about biology-relevant regression models: Susan Holmes and Wolfgang Huber (2019).
Teddy has never been in the lab and is bad at pipetting. Unfortunately, some label needed to be put in some Eppendorf tubes, and no one else was available to do it or even to supervise.
Each tube had a required volume of label which Teddy tried to hit, but probably there was some inaccuracy due to bad eyesight, poor hand control or something.
In addition, there was probably some consistent bias, with the pipette systematically adding either too much or too little liquid: it seemed pretty old.
Luckily, someone was able to measure how much label ended up in 8 out of the 24 tubes with pretty good accuracy. Now we want to estimate how much label there is in the other 16 tubes, taking into account these measurements as well as what we know about the likely biased pipette and inconsistent pipetter.
To describe this situation we’ll first think of a regression model that describes the measurement setup, then use Python to simulate data from the model given some plausible parameter values. Next we’ll implement the model in Stan, fit the simulated data using MCMC, and finally analyse the results.
To model the noise that Teddy introduced by being bad at pipetting and the bias introduced by the bad pipette, we need some parameters that connect the known target volumes with the unknown true volumes. Let’s call them \(noise\) and \(bias\). Since the volumes are constrained positive, a distribution that automatically excludes negative numbers is probably a good idea: the log-normal distribution is usually a good starting point. This equation describes a plausible relationship:
\[ volume \sim LN(\ln{(target\cdot bias)}, noise) \]
To model the helpful measurements, we use another log-normal distribution and assume the measuring device is unbiased and has known log-scale standard error \(cal\ error\):1
1 NB: the scale parameter of a lognormal distribution represents multiplicative error.
\[ measurements \sim LN(\ln{volume}, cal\ error) \]
To round off the model we can think about the likely values of the unknown parameters \(bias\) and \(noise\). \(bias\) is likely not to be too far away from 1, otherwise someone would have probably thrown the pipette away already. A prior distribution that puts most of its mass between 0.75 and 1.25 therefore seems reasonable. Similarly, a prior for \(noise\) should probably not imply that Teddy’s pipetting errors regularly exceeded 30%. This consideration motivates a prior for \(noise\) that puts most of its mass below 0.15.
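It's worth checking by simulation that priors like these actually encode those statements. The sketch below uses the prior densities that appear later in the Stan model (\(bias = \exp(b)\) with \(b \sim N(0, 0.15)\), and \(noise \sim LN(\ln 0.1, 0.5)\)), ignoring the standardisation that the model applies:

```python
import numpy as np

rng = np.random.default_rng(12345)
n = 100_000

# bias = exp(b), b ~ normal(0, 0.15), i.e. bias ~ lognormal(0, 0.15)
bias = rng.lognormal(mean=0.0, sigma=0.15, size=n)
# noise ~ lognormal(log(0.1), 0.5)
noise = rng.lognormal(mean=np.log(0.1), sigma=0.5, size=n)

# central 95% interval for the bias factor: roughly 0.75 to 1.34
print(np.quantile(bias, [0.025, 0.975]))
# fraction of prior noise draws below 0.15: roughly 0.8
print(np.mean(noise < 0.15))
```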
First some imports: as usual for this course we’ll be using arviz, matplotlib, cmdstanpy, pandas and numpy. stanio is a handy utility library for Stan: it is a dependency of cmdstanpy so you shouldn’t need to install it explicitly.
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import stanio
from cmdstanpy import CmdStanModel
Now some hardcoded numbers, including true values for the parameters: \(bias\) is 0.88 and \(noise\) is 0.1. Note that \(cal\ error\) is much smaller than \(noise\).
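The cell defining these constants isn't shown in this excerpt. A definition consistent with the text and outputs (24 tubes targeting 200, 400 and 800 μl, bias 0.88, noise 0.1, measurement indices recoverable from the arviz output below) might look like this; the RNG_SEED and CAL_ERROR values are placeholders, not the originals:

```python
import numpy as np

RNG_SEED = 12345                # assumption: the original seed is not shown here
N = 24                          # total number of tubes
N_CAL = 8                       # number of tubes that got measured
TARGET_VOLUMES = [200] * 8 + [400] * 8 + [800] * 8  # target volume (microlitres) per tube
BIAS_FACTOR = 0.88              # true pipette bias (from the text)
NOISE = 0.1                     # true log-scale pipetting noise (from the text)
CAL_ERROR = 0.02                # assumption: log-scale error of the measuring device
MEASUREMENT_IX = np.array([1, 4, 9, 11, 15, 19, 21, 22])  # indices of the measured tubes
```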
Simulate the true volumes
rng = np.random.default_rng(seed=RNG_SEED)
ln_mean = [
np.log(target * BIAS_FACTOR)
for target in TARGET_VOLUMES
]
volumes = rng.lognormal(mean=ln_mean, sigma=NOISE)
volumes
array([152.64294349, 199.70810807, 161.32449302, 171.49715397,
174.67894069, 163.43175945, 153.50063823, 187.79919405,
364.94147086, 289.55488638, 445.13256694, 387.79655766,
326.2592979 , 385.23402356, 335.94110379, 349.87019831,
761.78380169, 620.86368119, 745.73037525, 809.71007835,
803.52489045, 683.21425317, 770.52360505, 598.61585552])
Plot the volumes and the targets.
f, ax = plt.subplots()
bins = np.logspace(np.log10(100), np.log10(1000), 30)
ax.hist(volumes, bins=bins)
for t in (200, 400, 800):
ax.axvline(t, color="red")
ax.semilogx()
ax.set_xticks([200, 400, 800], [200, 400, 800]);
ax.set(
xlabel="volume ($\\mu$l)",
ylabel="Frequency",
title="How much label ended up in the tubes"
);
Simulate measurements for the tubes in MEASUREMENT_IX.
measurements = [
rng.lognormal(np.log(vol), CAL_ERROR)
for vol in volumes[MEASUREMENT_IX]
]
pd.DataFrame({
"target volume": np.array(TARGET_VOLUMES)[MEASUREMENT_IX],
"actual volume": volumes[MEASUREMENT_IX],
"measured volume": measurements
})

|   | target volume | actual volume | measured volume |
|---|---|---|---|
| 0 | 200 | 199.708108 | 199.077273 |
| 1 | 200 | 174.678941 | 176.256328 |
| 2 | 400 | 289.554886 | 281.877576 |
| 3 | 400 | 387.796558 | 387.163512 |
| 4 | 400 | 349.870198 | 362.149468 |
| 5 | 800 | 809.710078 | 853.238785 |
| 6 | 800 | 683.214253 | 693.919342 |
| 7 | 800 | 770.523605 | 783.399634 |
I wrote up the implied statistical model in a Stan file at src/stan/pipette.stan. This code loads the Stan file as a CmdStanModel object, checks its formatting and prints it out.
Note that the model internally puts the data on log scale and then standardises it: this is a bit annoying but makes it way easier to set priors and can ultimately save you a lot of trouble.
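To see what this standardisation does, here is a Python sketch of the same transformation, mirroring the standardise and unstandardise helpers in the Stan file (the y values are the measured volumes from the table above, rounded):

```python
import numpy as np

def standardise(v, m, s):
    # same as the Stan helper: subtract the mean, divide by the sd
    return (v - m) / s

def unstandardise(u, m, s):
    # inverse transformation
    return m + u * s

y = np.array([199.1, 176.3, 281.9, 387.2, 362.1, 853.2, 693.9, 783.4])
ln_y = np.log(y)
y_ls = standardise(ln_y, ln_y.mean(), ln_y.std(ddof=1))  # Stan's sd() uses n-1

# the round trip recovers the original measurements
back = np.exp(unstandardise(y_ls, ln_y.mean(), ln_y.std(ddof=1)))
print(np.allclose(back, y))  # True
```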
model = CmdStanModel(stan_file="../src/stan/pipette.stan")
model.format(overwrite_file=True, canonicalize=True)
print(model.code())

09:24:19 - cmdstanpy - INFO - compiling stan file /home/nicholas/courses/bayesian_statistics_for_computational_biology/src/stan/pipette.stan to exe file /home/nicholas/courses/bayesian_statistics_for_computational_biology/src/stan/pipette
09:24:30 - cmdstanpy - INFO - compiled model executable: /home/nicholas/courses/bayesian_statistics_for_computational_biology/src/stan/pipette
functions {
vector standardise(vector v, real m, real s) {
return (v - m) / s;
}
real standardise(real v, real m, real s) {
return (v - m) / s;
}
vector unstandardise(vector u, real m, real s) {
return m + u * s;
}
real unstandardise(real u, real m, real s) {
return m + u * s;
}
}
data {
int<lower=1> N;
int<lower=0> N_cal;
vector<lower=0>[N] target_volume;
vector<lower=0>[N_cal] y;
array[N_cal] int<lower=1, upper=N> measurement_ix;
real<lower=0> cal_error;
int<lower=0, upper=1> likelihood;
}
transformed data {
vector[N_cal] y_ls = standardise(log(y), mean(log(y)), sd(log(y)));
vector[N] target_volume_ls = standardise(log(target_volume), mean(log(y)),
sd(log(y)));
real cal_error_s = cal_error / sd(log(y));
}
parameters {
real<lower=0> volume_noise_s;
real bias_factor_l;
vector[N] volume_ls;
}
model {
volume_noise_s ~ lognormal(log(0.1), 0.5);
bias_factor_l ~ normal(0, 0.15);
volume_ls ~ normal(target_volume_ls + bias_factor_l, volume_noise_s);
if (likelihood) {
for (i in 1 : N_cal) {
y_ls[i] ~ normal(volume_ls[measurement_ix[i]], cal_error_s);
}
}
}
generated quantities {
real bias_factor = exp(bias_factor_l);
real volume_noise = volume_noise_s * sd(log(y));
vector[N] volume = exp(unstandardise(volume_ls, mean(log(y)), sd(log(y))));
vector[N_cal] y_rep;
vector[N_cal] llik;
for (i in 1 : N_cal) {
int ms_ix = measurement_ix[i];
y_rep[i] = lognormal_rng(log(volume[ms_ix]), cal_error);
llik[i] = lognormal_lpdf(y[i] | log(volume[ms_ix]), cal_error);
}
}
This code loads some data into a dictionary that is compatible with Stan and carries out two MCMC runs, one in prior mode and one in posterior mode.
stan_input_posterior = stanio.json.process_dictionary(
{
"N": N,
"N_cal": N_CAL,
"target_volume": TARGET_VOLUMES,
"y": measurements,
"measurement_ix": MEASUREMENT_IX + 1,
"cal_error": CAL_ERROR,
"likelihood": 1,
}
)
stan_input_prior = stan_input_posterior | {"likelihood": 0}
mcmc_prior = model.sample(
data=stan_input_prior,
adapt_delta=0.999,
max_treedepth=12,
seed=RNG_SEED,
)
mcmc_posterior = model.sample(data=stan_input_posterior, seed=RNG_SEED)
mcmc_prior.diagnose()
mcmc_posterior.diagnose()

09:24:30 - cmdstanpy - INFO - CmdStan start processing
09:24:30 - cmdstanpy - INFO - CmdStan done processing.
09:24:30 - cmdstanpy - INFO - CmdStan start processing
09:24:30 - cmdstanpy - INFO - CmdStan done processing.
09:24:30 - cmdstanpy - WARNING - Non-fatal error during sampling:
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'pipette.stan', line 38, column 2 to column 71) (repeated 4 times)
Consider re-running with show_console=True if the above output is unclear!

Processing csv files: /tmp/tmpok6geivi/pipettem6y5cnhe/pipette-20240430092430_1.csv, /tmp/tmpok6geivi/pipettem6y5cnhe/pipette-20240430092430_2.csv, /tmp/tmpok6geivi/pipettem6y5cnhe/pipette-20240430092430_3.csv, /tmp/tmpok6geivi/pipettem6y5cnhe/pipette-20240430092430_4.csv

Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.

Checking sampler transitions for divergences.
No divergent transitions found.

Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.

Effective sample size satisfactory.

Split R-hat values satisfactory all parameters.

Processing complete, no problems detected.
The diagnostics seem ok, though interestingly the prior was pretty tricky to sample accurately.
This code loads both MCMC runs into an arviz InferenceData object.
coords = {"obs": MEASUREMENT_IX, "tube": range(N)}
dims = {
"y": ["obs"],
"y_rep": ["obs"],
"target_volume": ["tube"],
"true_volume": ["tube"],
"volume": ["tube"],
"tube": ["tube"]
}
idata = az.from_cmdstanpy(
posterior=mcmc_posterior,
prior=mcmc_prior,
log_likelihood="llik",
observed_data=stan_input_posterior | {
"true_volume": volumes, "tube": range(N)
},
posterior_predictive={"y": "y_rep"},
coords=coords,
dims=dims
)
idata<xarray.Dataset> Size: 2MB
Dimensions: (chain: 4, draw: 1000, volume_ls_dim_0: 24, tube: 24)
Coordinates:
* chain (chain) int64 32B 0 1 2 3
* draw (draw) int64 8kB 0 1 2 3 4 5 6 ... 994 995 996 997 998 999
* volume_ls_dim_0 (volume_ls_dim_0) int64 192B 0 1 2 3 4 5 ... 19 20 21 22 23
* tube (tube) int64 192B 0 1 2 3 4 5 6 7 ... 17 18 19 20 21 22 23
Data variables:
volume_noise_s (chain, draw) float64 32kB 0.1904 0.1868 ... 0.2049 0.1824
bias_factor_l (chain, draw) float64 32kB -0.1112 -0.1494 ... -0.1279
volume_ls (chain, draw, volume_ls_dim_0) float64 768kB -1.474 ... ...
bias_factor (chain, draw) float64 32kB 0.8947 0.8613 ... 0.8381 0.8799
volume_noise (chain, draw) float64 32kB 0.1165 0.1143 ... 0.1254 0.1116
volume (chain, draw, tube) float64 768kB 161.9 196.3 ... 777.0
Attributes:
created_at: 2024-04-30T07:24:30.931538
arviz_version: 0.17.0
inference_library: cmdstanpy
inference_library_version: 1.2.1array([0, 1, 2, 3])
array([ 0, 1, 2, ..., 997, 998, 999])
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23])array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23])array([[0.190363, 0.186822, 0.134551, ..., 0.150411, 0.125601, 0.164536],
[0.135635, 0.160475, 0.180948, ..., 0.161191, 0.226544, 0.208362],
[0.181415, 0.204897, 0.159819, ..., 0.24368 , 0.170485, 0.213715],
[0.165047, 0.156805, 0.128412, ..., 0.170275, 0.204871, 0.182446]])array([[-0.111228 , -0.149368 , -0.110311 , ..., -0.0468771, -0.138876 ,
-0.0922775],
[-0.121281 , -0.109771 , -0.0926319, ..., -0.0904757, -0.158592 ,
-0.14872 ],
[-0.178611 , -0.128676 , -0.085717 , ..., 0.0889216, -0.211837 ,
-0.197323 ],
[-0.0835126, -0.13089 , -0.154868 , ..., -0.141769 , -0.176633 ,
-0.127938 ]])array([[[-1.47383 , -1.15921 , -1.21367 , ..., 0.969309, 1.0606 ,
1.02329 ],
[-1.26492 , -1.0866 , -1.42661 , ..., 0.862963, 1.13913 ,
0.870445],
[-1.44251 , -1.13697 , -1.26185 , ..., 0.926286, 1.08846 ,
0.963761],
...,
[-1.15171 , -1.13628 , -1.10095 , ..., 0.962168, 1.09349 ,
1.03624 ],
[-1.21334 , -1.22996 , -1.25696 , ..., 0.955795, 1.06646 ,
1.15667 ],
[-1.15451 , -1.15651 , -1.17656 , ..., 0.910885, 1.09359 ,
0.983682]],
[[-1.30735 , -1.14759 , -1.32117 , ..., 0.902283, 1.10736 ,
0.929398],
[-1.2908 , -1.12976 , -1.10165 , ..., 0.931843, 1.09585 ,
0.832189],
[-0.977239, -1.10896 , -1.37798 , ..., 0.916567, 1.03758 ,
0.874194],
...
[-0.846767, -1.16406 , -0.760256, ..., 0.925559, 1.07899 ,
1.06433 ],
[-1.61639 , -1.11348 , -1.81811 , ..., 0.885766, 1.12757 ,
1.00389 ],
[-1.27247 , -1.08085 , -1.36793 , ..., 0.880418, 1.05117 ,
0.871703]],
[[-1.17645 , -1.19121 , -1.20238 , ..., 0.950307, 1.09967 ,
1.29284 ],
[-1.30361 , -1.05745 , -1.27809 , ..., 0.865576, 1.11618 ,
0.792592],
[-1.13784 , -1.17576 , -1.29677 , ..., 0.901705, 1.06212 ,
1.23264 ],
...,
[-1.12664 , -1.1284 , -1.06767 , ..., 0.834501, 1.07771 ,
0.970031],
[-1.47313 , -1.13204 , -1.53602 , ..., 0.952897, 1.12532 ,
0.986104],
[-1.38305 , -1.10899 , -1.49591 , ..., 0.942392, 1.10633 ,
1.0888 ]]])array([[0.894735, 0.861252, 0.895555, ..., 0.954205, 0.870336, 0.911852],
[0.885785, 0.896039, 0.911529, ..., 0.913497, 0.853345, 0.86181 ],
[0.836431, 0.879259, 0.917854, ..., 1.093 , 0.809097, 0.820925],
[0.919879, 0.877314, 0.856528, ..., 0.867821, 0.838087, 0.879908]])array([[0.116487 , 0.11432 , 0.0823346, ..., 0.0920395, 0.0768581,
0.100683 ],
[0.0829981, 0.0981982, 0.110726 , ..., 0.098636 , 0.138627 ,
0.127501 ],
[0.111011 , 0.125381 , 0.0977966, ..., 0.149113 , 0.104323 ,
0.130776 ],
[0.100996 , 0.0959523, 0.078578 , ..., 0.104195 , 0.125365 ,
0.111642 ]])array([[[161.946, 196.328, 189.894, ..., 722.178, 763.669, 746.431],
[184.03 , 205.248, 166.694, ..., 676.678, 801.262, 679.783],
[165.079, 199.017, 184.376, ..., 703.413, 776.801, 719.73 ],
...,
[197.231, 199.102, 203.454, ..., 719.029, 779.193, 752.371],
[189.932, 188.009, 184.928, ..., 716.23 , 766.411, 809.908],
[196.893, 196.652, 194.255, ..., 696.815, 779.243, 728.557]],
[[179.314, 197.729, 177.804, ..., 693.157, 785.835, 704.754],
[181.139, 199.898, 203.366, ..., 705.809, 780.318, 664.055],
[219.453, 202.459, 171.729, ..., 699.242, 752.986, 681.345],
...,
[153.08 , 196.255, 190.642, ..., 696.894, 771.861, 757.343],
[176.134, 194.274, 180.019, ..., 702.82 , 771.696, 771.205],
[189.301, 201.191, 190.153, ..., 715.727, 761.482, 671.068]],
[[174.572, 197.823, 172.768, ..., 697.679, 767.644, 646.704],
[211.409, 195.457, 161.011, ..., 688.741, 760.749, 755.586],
[200.141, 200.543, 214.819, ..., 703.487, 765.846, 737.325],
...,
[237.692, 195.747, 250.614, ..., 703.1 , 772.312, 765.413],
[148.417, 201.899, 131.183, ..., 686.186, 795.613, 737.623],
[183.182, 205.971, 172.788, ..., 683.945, 759.276, 680.307]],
[[194.267, 192.521, 191.21 , ..., 713.829, 782.147, 880.286],
[179.725, 208.942, 182.553, ..., 677.761, 790.086, 648.158],
[198.912, 194.35 , 180.478, ..., 692.912, 764.381, 848.447],
...,
[200.28 , 200.065, 207.639, ..., 664.995, 771.708, 722.497],
[162.015, 199.619, 155.899, ..., 714.961, 794.521, 729.638],
[171.196, 202.455, 159.773, ..., 710.38 , 785.338, 776.962]]])PandasIndex(Index([0, 1, 2, 3], dtype='int64', name='chain'))
PandasIndex(Index([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
...
990, 991, 992, 993, 994, 995, 996, 997, 998, 999],
dtype='int64', name='draw', length=1000))PandasIndex(Index([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23],
dtype='int64', name='volume_ls_dim_0'))PandasIndex(Index([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23],
dtype='int64', name='tube'))<xarray.Dataset> Size: 264kB
Dimensions: (chain: 4, draw: 1000, obs: 8)
Coordinates:
* chain (chain) int64 32B 0 1 2 3
* draw (draw) int64 8kB 0 1 2 3 4 5 6 7 ... 993 994 995 996 997 998 999
* obs (obs) int64 64B 1 4 9 11 15 19 21 22
Data variables:
y_rep (chain, draw, obs) float64 256kB 194.0 169.6 277.7 ... 687.6 801.0
Attributes:
created_at: 2024-04-30T07:24:30.936747
arviz_version: 0.17.0
inference_library: cmdstanpy
inference_library_version: 1.2.1array([0, 1, 2, 3])
array([ 0, 1, 2, ..., 997, 998, 999])
array([ 1, 4, 9, 11, 15, 19, 21, 22])
array([[[194.019, 169.649, 277.702, ..., 870.471, 711.13 , 763.027],
[210.568, 175.395, 271.559, ..., 808.134, 677.842, 819.582],
[196.65 , 181.258, 296.44 , ..., 817.219, 712.216, 781.865],
...,
[198.22 , 174.531, 286.846, ..., 871.034, 712.863, 771.035],
[195.913, 172.949, 291.973, ..., 832.453, 705.383, 768.484],
[194.273, 172.032, 298.204, ..., 840.177, 679.01 , 783.601]],
[[198.638, 170.929, 277.314, ..., 841.19 , 697.767, 817.759],
[189.746, 166.394, 271.708, ..., 844.603, 709.865, 771.034],
[204.193, 180.485, 283.941, ..., 884.944, 680.183, 753.723],
...,
[188.425, 168.897, 276.549, ..., 854.137, 696.153, 784.88 ],
[194.64 , 180.233, 292.583, ..., 861.935, 709.016, 758.358],
[205.738, 176.185, 286.653, ..., 860.962, 705.323, 748.793]],
[[194.97 , 169.283, 295.481, ..., 819.818, 714.959, 792.203],
[198.075, 171.678, 283.181, ..., 886.779, 683.517, 750.009],
[199.833, 181.001, 286.676, ..., 836.956, 705.259, 761.851],
...,
[196.362, 175.71 , 274.998, ..., 871.97 , 675.318, 783.702],
[203.877, 168.994, 287.851, ..., 865.126, 673.657, 788.059],
[213.415, 167.349, 278.116, ..., 842.203, 706.18 , 755.227]],
[[196.505, 176.134, 292.787, ..., 859.59 , 732.783, 769.574],
[201.363, 176.912, 280.024, ..., 846.944, 684.181, 793.165],
[195.632, 176.309, 277.147, ..., 847.16 , 713.981, 750.206],
...,
[201.932, 180.803, 292.711, ..., 872.388, 669.98 , 769.098],
[195.984, 177.052, 286.623, ..., 854.997, 716.108, 819.1 ],
[205.122, 174.784, 270.119, ..., 844.897, 687.563, 801.038]]])PandasIndex(Index([0, 1, 2, 3], dtype='int64', name='chain'))
PandasIndex(Index([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
...
990, 991, 992, 993, 994, 995, 996, 997, 998, 999],
dtype='int64', name='draw', length=1000))PandasIndex(Index([1, 4, 9, 11, 15, 19, 21, 22], dtype='int64', name='obs'))
<xarray.Dataset> Size: 264kB
Dimensions: (chain: 4, draw: 1000, llik_dim_0: 8)
Coordinates:
* chain (chain) int64 32B 0 1 2 3
* draw (draw) int64 8kB 0 1 2 3 4 5 6 7 ... 993 994 995 996 997 998 999
* llik_dim_0 (llik_dim_0) int64 64B 0 1 2 3 4 5 6 7
Data variables:
llik (chain, draw, llik_dim_0) float64 256kB -2.542 -2.189 ... -3.678
Attributes:
created_at: 2024-04-30T07:24:30.979894
arviz_version: 0.17.0
inference_library: cmdstanpy
inference_library_version: 1.2.1array([0, 1, 2, 3])
array([ 0, 1, 2, ..., 997, 998, 999])
array([0, 1, 2, 3, 4, 5, 6, 7])
array([[[-2.54227, -2.1886 , -2.82927, ..., -4.59573, -5.54081,
-4.48391],
[-3.46536, -2.51029, -3.36554, ..., -4.75575, -4.34056,
-4.30591],
[-2.30072, -4.02094, -4.33531, ..., -4.10092, -3.7801 ,
-3.75999],
...,
[-2.30063, -2.18504, -3.20698, ..., -3.81615, -5.12864,
-3.7068 ],
[-6.39054, -2.41232, -2.80555, ..., -3.75599, -4.8011 ,
-4.27142],
[-2.48839, -2.37681, -3.37903, ..., -3.9243 , -3.57095,
-3.70594]],
[[-2.35835, -2.62049, -2.72866, ..., -4.39262, -3.55078,
-3.68261],
[-2.32177, -3.14514, -3.44059, ..., -5.08693, -3.91007,
-3.68997],
[-2.65529, -3.47961, -2.67513, ..., -4.97457, -3.62226,
-5.63035],
...
[-2.65642, -3.00944, -2.71436, ..., -3.75855, -3.76523,
-3.92454],
[-2.5482 , -2.92394, -3.22157, ..., -3.82116, -3.70625,
-3.96973],
[-3.74934, -2.9081 , -3.37917, ..., -3.8633 , -3.81131,
-4.89337]],
[[-3.70223, -2.69272, -3.342 , ..., -3.77142, -4.54951,
-3.67376],
[-5.22442, -2.2637 , -2.66227, ..., -3.75669, -4.24317,
-3.76085],
[-3.02265, -3.82299, -3.14116, ..., -4.11662, -3.55191,
-4.42553],
...,
[-2.33121, -2.26524, -2.68822, ..., -4.3502 , -5.81518,
-3.95319],
[-2.30984, -2.53049, -3.04029, ..., -4.33784, -4.66473,
-3.91893],
[-2.65455, -2.17913, -2.64854, ..., -5.0621 , -4.23631,
-3.67819]]])PandasIndex(Index([0, 1, 2, 3], dtype='int64', name='chain'))
PandasIndex(Index([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
...
990, 991, 992, 993, 994, 995, 996, 997, 998, 999],
dtype='int64', name='draw', length=1000))PandasIndex(Index([0, 1, 2, 3, 4, 5, 6, 7], dtype='int64', name='llik_dim_0'))
<xarray.Dataset> Size: 204kB
Dimensions: (chain: 4, draw: 1000)
Coordinates:
* chain (chain) int64 32B 0 1 2 3
* draw (draw) int64 8kB 0 1 2 3 4 5 6 ... 994 995 996 997 998 999
Data variables:
lp (chain, draw) float64 32kB 24.25 20.65 ... 22.17 22.93
acceptance_rate (chain, draw) float64 32kB 0.5925 0.8697 ... 0.9066 0.9954
step_size (chain, draw) float64 32kB 0.3888 0.3888 ... 0.3905 0.3905
tree_depth (chain, draw) int64 32kB 3 3 3 4 3 3 4 3 ... 3 3 3 3 4 3 3
n_steps (chain, draw) int64 32kB 7 7 7 15 7 7 15 ... 15 7 7 15 7 7
diverging (chain, draw) bool 4kB False False False ... False False
energy (chain, draw) float64 32kB -13.03 -7.201 ... -10.38 -13.52
Attributes:
created_at: 2024-04-30T07:24:30.934936
arviz_version: 0.17.0
inference_library: cmdstanpy
inference_library_version: 1.2.1array([0, 1, 2, 3])
array([ 0, 1, 2, ..., 997, 998, 999])
array([[24.2495, 20.6461, 24.5573, ..., 27.9915, 22.4859, 29.9771],
[29.4156, 25.1177, 25.0694, ..., 26.8262, 19.113 , 25.4518],
[23.8134, 24.1238, 22.4094, ..., 13.4716, 22.8161, 19.9685],
[22.8856, 23.7445, 23.9917, ..., 24.8785, 22.1713, 22.9298]])array([[0.592459, 0.869672, 0.90333 , ..., 0.964612, 0.784633, 1. ],
[0.938408, 0.739341, 0.984552, ..., 0.811589, 0.889806, 0.922895],
[0.991445, 0.579492, 0.931716, ..., 0.855036, 0.982912, 0.545114],
[0.94494 , 0.979635, 0.953584, ..., 0.78852 , 0.9066 , 0.995373]])array([[0.388816, 0.388816, 0.388816, ..., 0.388816, 0.388816, 0.388816],
[0.350042, 0.350042, 0.350042, ..., 0.350042, 0.350042, 0.350042],
[0.49684 , 0.49684 , 0.49684 , ..., 0.49684 , 0.49684 , 0.49684 ],
[0.390549, 0.390549, 0.390549, ..., 0.390549, 0.390549, 0.390549]])array([[3, 3, 3, ..., 3, 3, 3],
[3, 4, 4, ..., 3, 3, 3],
[3, 3, 3, ..., 3, 3, 3],
[3, 3, 3, ..., 4, 3, 3]])array([[ 7, 7, 7, ..., 7, 7, 7],
[15, 15, 15, ..., 7, 15, 7],
[ 7, 7, 7, ..., 7, 15, 7],
[ 7, 7, 7, ..., 15, 7, 7]])array([[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]])array([[-13.0302 , -7.20071, -8.80308, ..., -13.1924 , -13.2195 ,
-14.3522 ],
[-14.2885 , -12.5602 , -12.1762 , ..., -11.4171 , -13.3478 ,
-6.47956],
[-11.3062 , -10.3593 , -8.63512, ..., -2.58044, -3.596 ,
-5.85565],
[ -7.85902, -14.9481 , -14.1336 , ..., -10.5096 , -10.3834 ,
-13.5222 ]])PandasIndex(Index([0, 1, 2, 3], dtype='int64', name='chain'))
PandasIndex(Index([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
...
990, 991, 992, 993, 994, 995, 996, 997, 998, 999],
dtype='int64', name='draw', length=1000))<xarray.Dataset> Size: 2MB
Dimensions: (chain: 4, draw: 1000, volume_ls_dim_0: 24, tube: 24,
obs: 8, llik_dim_0: 8)
Coordinates:
* chain (chain) int64 32B 0 1 2 3
* draw (draw) int64 8kB 0 1 2 3 4 5 6 ... 994 995 996 997 998 999
* volume_ls_dim_0 (volume_ls_dim_0) int64 192B 0 1 2 3 4 5 ... 19 20 21 22 23
* tube (tube) int64 192B 0 1 2 3 4 5 6 7 ... 17 18 19 20 21 22 23
* obs (obs) int64 64B 1 4 9 11 15 19 21 22
* llik_dim_0 (llik_dim_0) int64 64B 0 1 2 3 4 5 6 7
Data variables:
volume_noise_s (chain, draw) float64 32kB 0.02974 0.02728 ... 0.2746
bias_factor_l (chain, draw) float64 32kB -0.1025 -0.07485 ... -0.1921
volume_ls (chain, draw, volume_ls_dim_0) float64 768kB -1.18 ... 0...
bias_factor (chain, draw) float64 32kB 0.9026 0.9279 ... 0.9219 0.8252
volume_noise (chain, draw) float64 32kB 0.0182 0.01669 ... 0.1236 0.168
volume (chain, draw, tube) float64 768kB 193.8 185.0 ... 693.1
y_rep (chain, draw, obs) float64 256kB 184.8 189.1 ... 1.002e+03
llik (chain, draw, llik_dim_0) float64 256kB -8.981 ... -87.32
Attributes:
created_at: 2024-04-30T07:24:30.972737
arviz_version: 0.17.0
inference_library: cmdstanpy
inference_library_version: 1.2.1array([0, 1, 2, 3])
array([ 0, 1, 2, ..., 997, 998, 999])
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23])array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23])array([ 1, 4, 9, 11, 15, 19, 21, 22])
array([0, 1, 2, 3, 4, 5, 6, 7])
array([[0.0297445, 0.0272758, 0.0265095, ..., 0.178736 , 0.192878 ,
0.143125 ],
[0.0217743, 0.0275153, 0.0318496, ..., 0.187067 , 0.213967 ,
0.2151 ],
[0.14644 , 0.161344 , 0.154124 , ..., 0.222906 , 0.210747 ,
0.183849 ],
[0.196802 , 0.224444 , 0.332829 , ..., 0.104148 , 0.202053 ,
0.274571 ]])array([[-0.102507 , -0.074851 , -0.071183 , ..., -0.0135833, -0.072937 ,
-0.0888987],
[-0.131403 , -0.137815 , -0.131853 , ..., -0.0773899, 0.0578722,
-0.0226735],
[ 0.134711 , 0.0641626, -0.0368078, ..., -0.277545 , -0.22355 ,
-0.29311 ],
[-0.0320055, -0.127452 , 0.0724585, ..., -0.0552708, -0.0813462,
-0.192088 ]])array([[[-1.18014 , -1.25596 , -1.21539 , ..., 1.06275 , 1.04836 ,
1.03579 ],
[-1.24896 , -1.18338 , -1.21905 , ..., 1.03778 , 1.05845 ,
1.05922 ],
[-1.18062 , -1.20861 , -1.20187 , ..., 1.06469 , 1.05634 ,
1.01356 ],
...,
[-1.06469 , -1.19225 , -1.39826 , ..., 0.878588, 1.39824 ,
1.34754 ],
[-1.43398 , -1.40853 , -1.21943 , ..., 1.50548 , 0.928725,
0.986833],
[-1.19609 , -1.28651 , -1.26602 , ..., 0.972567, 1.07923 ,
1.10071 ]],
[[-1.24692 , -1.24356 , -1.21965 , ..., 1.00025 , 0.983762,
1.02885 ],
[-1.28328 , -1.29047 , -1.25435 , ..., 1.07908 , 0.979001,
0.99518 ],
[-1.24983 , -1.2417 , -1.28149 , ..., 0.913942, 1.01969 ,
1.0017 ],
...
[-1.28413 , -1.1961 , -1.73514 , ..., 1.07113 , 1.08167 ,
0.796746],
[-1.53629 , -1.42865 , -1.07195 , ..., 0.73767 , 0.848461,
1.2325 ],
[-1.33974 , -1.18125 , -1.43554 , ..., 0.932523, 0.996232,
0.864258]],
[[-1.04915 , -0.949361, -0.929134, ..., 1.15517 , 0.924563,
0.388474],
[-1.26699 , -1.06466 , -1.12933 , ..., 0.873924, 0.755863,
1.21534 ],
[-1.1652 , -1.53275 , -1.46456 , ..., 1.52952 , 1.5088 ,
1.27139 ],
...,
[-1.14013 , -1.28078 , -1.14028 , ..., 1.01727 , 1.10426 ,
0.960508],
[-1.4791 , -1.09886 , -1.39305 , ..., 0.918405, 1.08038 ,
1.22328 ],
[-1.2026 , -1.02384 , -1.02024 , ..., 1.01766 , 1.52503 ,
0.902215]]])array([[0.902572, 0.927882, 0.931291, ..., 0.986509, 0.929659, 0.914938],
[0.876865, 0.87126 , 0.87647 , ..., 0.925529, 1.05958 , 0.977582],
[1.14421 , 1.06627 , 0.963861, ..., 0.757641, 0.799675, 0.74594 ],
[0.968501, 0.880336, 1.07515 , ..., 0.946229, 0.921874, 0.825234]])array([[0.0182013, 0.0166906, 0.0162217, ..., 0.109372 , 0.118026 ,
0.0875808],
[0.0133241, 0.0168372, 0.0194894, ..., 0.11447 , 0.130931 ,
0.131624 ],
[0.0896095, 0.0987295, 0.0943116, ..., 0.136401 , 0.12896 ,
0.112501 ],
[0.120427 , 0.137342 , 0.203665 , ..., 0.0637301, 0.12364 ,
0.168016 ]])array([[[ 193.83 , 185.043, 189.693, ..., 764.673, 757.97 ,
752.164],
[ 185.837, 193.446, 189.27 , ..., 753.078, 762.666,
763.023],
[ 193.772, 190.482, 191.27 , ..., 765.581, 761.682,
742.002],
...,
[ 208.018, 192.399, 169.611, ..., 683.179, 938.933,
910.251],
[ 165.943, 168.549, 189.225, ..., 1002.61 , 704.464,
729.963],
[ 191.947, 181.615, 183.906, ..., 723.619, 772.422,
782.645]],
[[ 186.069, 186.452, 189.2 , ..., 735.982, 728.593,
748.973],
[ 181.974, 181.175, 185.225, ..., 772.355, 726.473,
733.701],
[ 185.738, 186.664, 182.174, ..., 698.12 , 744.787,
736.633],
...
[ 181.879, 191.946, 138.015, ..., 768.607, 773.58 ,
649.808],
[ 155.873, 166.485, 207.096, ..., 626.737, 670.7 ,
848.373],
[ 175.795, 193.698, 165.786, ..., 706.103, 734.174,
677.215]],
[[ 210.005, 223.229, 226.009, ..., 809.167, 702.672,
506.157],
[ 183.797, 208.022, 199.951, ..., 681.232, 633.753,
839.513],
[ 195.61 , 156.211, 162.867, ..., 1017.47 , 1004.65 ,
868.806],
...,
[ 198.634, 182.253, 198.615, ..., 743.688, 784.348,
718.299],
[ 161.425, 203.714, 170.153, ..., 700.029, 772.966,
843.605],
[ 191.183, 213.283, 213.753, ..., 743.862, 1014.68 ,
693.128]]])array([[[ 184.813, 189.054, 386.057, ..., 759.48 , 754.107,
752.463],
[ 188.057, 194.853, 377.611, ..., 747.653, 741.944,
791.905],
[ 191.471, 191.019, 380.344, ..., 762.074, 750.234,
764.223],
...,
[ 188.237, 217.168, 422.702, ..., 804.179, 687.31 ,
936.802],
[ 167.074, 207.593, 425.475, ..., 681.052, 1021.48 ,
705.951],
[ 177.321, 174.68 , 365.093, ..., 756.83 , 745.255,
747.625]],
<xarray.Dataset> Size: 204kB
Dimensions: (chain: 4, draw: 1000)
Coordinates:
* chain (chain) int64 32B 0 1 2 3
* draw (draw) int64 8kB 0 1 2 3 4 5 6 ... 994 995 996 997 998 999
Data variables:
lp (chain, draw) float64 32kB 72.09 71.09 ... 23.28 14.77
acceptance_rate (chain, draw) float64 32kB 0.9856 0.9735 ... 0.9808 0.99
step_size (chain, draw) float64 32kB 0.03423 0.03423 ... 0.04274
tree_depth (chain, draw) int64 32kB 4 4 4 4 4 4 4 4 ... 7 6 7 7 6 6 7
n_steps (chain, draw) int64 32kB 15 31 15 15 15 ... 127 63 63 127
diverging (chain, draw) bool 4kB False False False ... False False
energy (chain, draw) float64 32kB -58.38 -61.21 ... -14.09 -3.941
Attributes:
created_at: 2024-04-30T07:24:30.975978
arviz_version: 0.17.0
inference_library: cmdstanpy
inference_library_version: 1.2.1
<xarray.Dataset> Size: 896B
Dimensions: (N_dim_0: 1, N_cal_dim_0: 1, tube: 24, obs: 8,
measurement_ix_dim_0: 8, cal_error_dim_0: 1,
likelihood_dim_0: 1)
Coordinates:
* N_dim_0 (N_dim_0) int64 8B 0
* N_cal_dim_0 (N_cal_dim_0) int64 8B 0
* tube (tube) int64 192B 0 1 2 3 4 5 6 ... 18 19 20 21 22 23
* obs (obs) int64 64B 1 4 9 11 15 19 21 22
* measurement_ix_dim_0 (measurement_ix_dim_0) int64 64B 0 1 2 3 4 5 6 7
* cal_error_dim_0 (cal_error_dim_0) int64 8B 0
* likelihood_dim_0 (likelihood_dim_0) int64 8B 0
Data variables:
N (N_dim_0) int64 8B 24
N_cal (N_cal_dim_0) int64 8B 8
target_volume (tube) int64 192B 200 200 200 200 ... 800 800 800 800
y (obs) float64 64B 199.1 176.3 281.9 ... 693.9 783.4
measurement_ix (measurement_ix_dim_0) int64 64B 2 5 10 12 16 20 22 23
cal_error (cal_error_dim_0) float64 8B 0.02
likelihood (likelihood_dim_0) int64 8B 1
true_volume (tube) float64 192B 152.6 199.7 161.3 ... 770.5 598.6
Attributes:
created_at: 2024-04-30T07:24:30.978435
arviz_version: 0.17.0
inference_library: cmdstanpy
inference_library_version: 1.2.1
Next we look at the summaries of both the posterior and prior.
for group_name in ["prior", "posterior"]:
    group = idata.get(group_name)
    group_summary = az.summary(
        group,
        var_names=[
            "volume_noise", "bias_factor", "volume_noise_s", "bias_factor_l"
        ]
    )
    display(group_summary)

Prior summary:

|  | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|
| volume_noise | 0.066 | 0.034 | 0.016 | 0.128 | 0.002 | 0.001 | 224.0 | 380.0 | 1.01 |
| bias_factor | 0.994 | 0.143 | 0.731 | 1.252 | 0.015 | 0.010 | 80.0 | 120.0 | 1.06 |
| volume_noise_s | 0.108 | 0.055 | 0.026 | 0.210 | 0.003 | 0.002 | 224.0 | 380.0 | 1.01 |
| bias_factor_l | -0.016 | 0.141 | -0.286 | 0.242 | 0.015 | 0.011 | 80.0 | 120.0 | 1.06 |
Posterior summary:

|  | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|
| volume_noise | 0.112 | 0.026 | 0.067 | 0.159 | 0.001 | 0.001 | 1149.0 | 1754.0 | 1.0 |
| bias_factor | 0.887 | 0.055 | 0.780 | 0.987 | 0.001 | 0.001 | 2284.0 | 2339.0 | 1.0 |
| volume_noise_s | 0.183 | 0.042 | 0.109 | 0.259 | 0.001 | 0.001 | 1149.0 | 1754.0 | 1.0 |
| bias_factor_l | -0.122 | 0.062 | -0.231 | 0.002 | 0.001 | 0.001 | 2284.0 | 2339.0 | 1.0 |
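These summaries double as convergence diagnostics: r_hat values close to 1 and large ess_bulk/ess_tail indicate well-mixed chains, so the prior group's bias_factor (r_hat 1.06, ess_bulk 80) is the shakiest quantity here. As a minimal, self-contained sketch of how these columns behave, here is `az.summary` applied to synthetic, perfectly mixed draws (the variable name `theta` and the simulated draws are made up for illustration):

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(12345)
# 4 chains x 1000 draws of independent standard normal noise:
# well-mixed by construction, so r_hat should be very close to 1
# and the effective sample sizes close to the nominal 4000.
idata_fake = az.from_dict(posterior={"theta": rng.normal(size=(4, 1000))})
summary = az.summary(idata_fake)
print(summary[["ess_bulk", "ess_tail", "r_hat"]])
```

With real MCMC output the draws are autocorrelated, so the effective sample sizes are typically well below the nominal draw count, as in the tables above.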
This plot compares the measurements with the model's posterior predictive distribution.
az.plot_lm(
    y=idata.observed_data["y"],
    x=idata.observed_data["tube"].sel(tube=MEASUREMENT_IX + 1),
    y_hat=idata.posterior_predictive["y_rep"]
)
ax = plt.gca()
ax.semilogy()

This plot compares the volume_noise and bias_factor samples with the true values that we used to simulate the data.
az.plot_posterior(
    idata.prior,
    var_names=["volume_noise", "bias_factor"],
    kind="hist",
    hdi_prob="hide",
    point_estimate=None,
    figsize=[12, 4]
)
f = plt.gcf()
axes = f.axes
az.plot_posterior(
    idata.posterior,
    var_names=["volume_noise", "bias_factor"],
    kind="hist",
    hdi_prob="hide",
    point_estimate=None,
    figsize=[12, 4],
    ax=axes,
    color="tab:orange"
)
for ax, truth in zip(f.axes, [NOISE, BIAS_FACTOR]):
    ax.axvline(truth, color="red")

This plot shows the samples for all the tubes’ volumes, including those that weren’t measured, alongside the true volumes.
az.plot_lm(
    x=idata.observed_data["tube"],
    y=idata.observed_data["true_volume"],
    y_hat=idata.posterior["volume"],
    grid=False,
    y_kwargs={"label": "true volume"},
    figsize=[10, 5],
    legend=False,
)
ax = plt.gca()
for i in MEASUREMENT_IX:
    ax.text(i + 0.1, volumes[i], "obs", zorder=1000)
ax.set(xlabel="tube", ylabel="volume ($\\mu$l)");
ax.semilogy()

So, what is the probability that Teddy put less than 350 \(\mu\)l of label into tube 10, even though the target amount was 400 \(\mu\)l?
az.plot_posterior(
    idata.prior,
    var_names=["volume"],
    coords={"tube": [10]},
    kind="hist",
    hdi_prob="hide",
    point_estimate=None,
    ref_val=350,
    figsize=[12, 4],
    bins=np.linspace(250, 600, 30),
)

Phew, only about 13%, that’s probably fine, right?
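The same probability can also be computed directly as the fraction of draws below the threshold, rather than read off the plot. A minimal self-contained sketch, with simulated lognormal draws standing in for the model's `volume` samples for tube 10 (all numbers below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the model's volume draws for tube 10:
# 4000 simulated draws, lognormal around the 400 microlitre target.
volume_draws = rng.lognormal(mean=np.log(400), sigma=0.11, size=4000)
# The Monte Carlo probability estimate is just the fraction of
# draws that fall below 350 microlitres.
p_under_350 = float((volume_draws < 350).mean())
print(f"P(volume < 350) is roughly {p_under_350:.0%}")
```

With the real samples you would replace `volume_draws` with something like `idata.prior["volume"].sel(tube=10)`; the fraction-of-draws recipe is the same.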